kNN graph


Incorporating Fairness in Neighborhood Graphs for Fair Spectral Clustering

Moorthy, Adithya K, Saradhi, V Vijaya, Prasad, Bhanu

arXiv.org Artificial Intelligence

Abstract--Graph clustering plays a pivotal role in unsupervised learning methods like spectral clustering, yet traditional methods for graph clustering often perpetuate bias through unfair graph constructions that may underrepresent some groups. The current research introduces novel approaches for constructing fair k-nearest neighbor (kNN) and fair ϵ-neighborhood graphs that proactively enforce demographic parity during graph formation. By incorporating fairness constraints at the earliest stage of neighborhood selection, our approaches build proportional representation of sensitive features into the local graph structure while maintaining geometric consistency. Our work addresses a critical gap in pre-processing for fair spectral clustering, demonstrating that topological fairness in graph construction is essential for achieving equitable clustering outcomes. Widely used graph construction methods like kNN and ϵ-neighborhood graphs propagate edge-based disparate impact on sensitive groups, leading to biased clustering results. Providing representation of each sensitive group in the neighborhood of every node leads to fairer spectral clustering results because the topological features of the graph naturally reflect equitable group ratios. This research fills an essential gap in fair unsupervised learning by illustrating how topological fairness in graph construction inherently facilitates fairer spectral clustering results without the need for changes to the clustering algorithm itself. Thorough experiments on three synthetic datasets, seven real-world tabular datasets, and three real-world image datasets show that our fair graph construction methods surpass the current baselines in graph clustering tasks. Machine learning algorithms are widely used for decision-making in a variety of fields, including criminal justice [1], healthcare [2], [3], and finance [4]. The reason for this is that these algorithms have been shown to be very accurate and effective at analyzing big datasets. The increasing prevalence of these algorithms has raised questions regarding their fairness and potential to reinforce societal biases [5], [6]. These biases can result in unfair treatment of certain groups of people, thereby creating significant societal implications. Recently, concerns have been raised about the fairness of clusters produced by popular clustering algorithms.
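The core idea, giving every sensitive group a quota in each node's neighbor list, can be sketched in a few lines. This is a minimal illustration assuming a proportional quota rule and Euclidean distances; `fair_knn_edges` and its quota formula are our own toy construction, not the paper's exact algorithm:

```python
from collections import Counter

def fair_knn_edges(points, groups, k):
    """Toy fair-kNN construction: every node draws neighbors from each
    sensitive group in proportion to that group's share of the data.
    Illustrative sketch only, not the paper's exact algorithm."""
    n = len(points)
    counts = Counter(groups)
    # Per-group quota, proportional to group size (at least one each).
    quotas = {g: max(1, round(k * c / n)) for g, c in counts.items()}
    edges = set()
    for i, p in enumerate(points):
        for g, q in quotas.items():
            # The q nearest neighbors of i restricted to group g.
            cand = sorted(
                (j for j in range(n) if j != i and groups[j] == g),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])),
            )
            for j in cand[:q]:
                edges.add((min(i, j), max(i, j)))
    return edges
```

With two equally sized groups and k = 2, every node ends up adjacent to at least one member of each group, which is the topological-fairness property the abstract describes.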


Towards Robust Graph Structural Learning Beyond Homophily via Preserving Neighbor Similarity

Zhu, Yulin, Lai, Yuni, Ai, Xing, LO, Wai Lun, Li, Gaolei, Li, Jianhua, Tang, Di, Zhang, Xingxing, Yang, Mengpei, Zhou, Kai

arXiv.org Artificial Intelligence

Despite the tremendous success of graph-based learning systems in handling structural data, it has been widely shown that they are fragile to adversarial attacks on homophilic graph data, where adversaries maliciously modify the semantic and topology information of the raw graph data to degrade the predictive performances. Motivated by this, a series of robust models have been crafted to enhance the adversarial robustness of graph-based learning systems on homophilic graphs. However, the security of graph-based learning systems on heterophilic graphs remains a mystery to us. To bridge this gap, in this paper, we start to explore the vulnerability of graph-based learning systems regardless of the homophily degree, and theoretically prove that the update of the negative classification loss is negatively correlated with the pairwise similarities based on the powered aggregated neighbor features. This theoretical finding inspires us to craft a novel robust graph structural learning strategy that serves as a useful graph mining module in a robust model, incorporating a dual-kNN graph construction pipeline to supervise the neighbor-similarity-preserved propagation, where the graph convolutional layer adaptively smooths or discriminates the features of node pairs according to their rich local structures. In this way, the proposed methods can mine the "better" topology of the raw graph data under diverse graph homophily and achieve more reliable data management on homophilic and heterophilic graphs.
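The dual-kNN idea can be caricatured as two kNN graphs: one over raw node features and one over aggregated neighbor features. The helpers below (`aggregate_neighbors`, `knn_from_features`) are hypothetical stand-ins for the paper's pipeline, sketched under the assumption of mean-pooled neighbors and Euclidean distance:

```python
def aggregate_neighbors(features, adj):
    """Mean-pool each node's neighbor features (one propagation step)."""
    dim = len(features[0])
    agg = []
    for i, nbrs in enumerate(adj):
        if nbrs:
            agg.append(tuple(sum(features[j][d] for j in nbrs) / len(nbrs)
                             for d in range(dim)))
        else:
            agg.append(features[i])  # isolated node keeps its own features
    return agg

def knn_from_features(feats, k):
    """kNN neighbor lists by squared Euclidean distance."""
    n = len(feats)
    return [sorted((j for j in range(n) if j != i),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(feats[i], feats[j])))[:k]
            for i in range(n)]
```

A second kNN graph built on `aggregate_neighbors` output is then the "dual" view that can be compared against the raw-feature graph when deciding which edges to trust.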


Scalable Varied-Density Clustering via Graph Propagation

Pham, Ninh, Zheng, Yingtao, Phibbs, Hugo

arXiv.org Artificial Intelligence

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based clustering with graph connectivity, enabling the use of efficient graph propagation techniques developed in network science. To ensure scalability, we introduce a density-aware neighborhood propagation algorithm and leverage advanced random projection methods to construct approximate neighborhood graphs. Our approach significantly reduces computational cost while preserving clustering quality. Empirically, it scales to datasets with millions of points in minutes and achieves competitive accuracy compared to existing baselines.
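A bare-bones version of the graph-propagation view: build a mutual-kNN graph and propagate cluster labels through its connected components. `mutual_knn_clusters` is our own illustrative stand-in; the paper's density-aware propagation and random-projection graph construction are considerably more involved:

```python
def mutual_knn_clusters(points, k):
    """Cluster by connected components of the mutual-kNN graph,
    a simple stand-in for density-aware neighborhood propagation."""
    n = len(points)
    def dist2(i, j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist2(i, j))[:k]
           for i in range(n)]
    # Keep only mutual edges: i and j must each be in the other's kNN list.
    adj = [[j for j in knn[i] if i in knn[j]] for i in range(n)]
    labels, cur = [-1] * n, 0
    for s in range(n):
        if labels[s] != -1:
            continue
        stack, labels[s] = [s], cur
        while stack:  # propagate the label through the component
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = cur
                    stack.append(v)
        cur += 1
    return labels
```

The mutual-edge filter is what makes the toy density-sensitive: points in sparse regions rarely appear in each other's kNN lists symmetrically, so spurious bridges between clusters are pruned.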


NOMAD Projection

Duderstadt, Brandon, Nussbaum, Zach, van der Maaten, Laurens

arXiv.org Artificial Intelligence

The rapid adoption of generative AI has driven an explosion in the size of datasets consumed and produced by AI models. Traditional methods for unstructured data visualization, such as t-SNE and UMAP, have not kept up with the pace of dataset scaling. This presents a significant challenge for AI explainability, which relies on methods such as t-SNE and UMAP for exploratory data analysis. In this paper, we introduce Negative Or Mean Affinity Discrimination (NOMAD) Projection, the first method for unstructured data visualization via nonlinear dimensionality reduction that can run on multiple GPUs at train time. We provide theory that situates NOMAD Projection as an approximate upper bound on the InfoNC-t-SNE loss, and empirical results that demonstrate NOMAD Projection's superior performance and speed profile compared to existing state-of-the-art methods. We demonstrate the scalability of NOMAD Projection by computing the first complete data map of Multilingual Wikipedia.

CVPR 2025 Tutorial - Identifying Structure in Data: All you need to know about Dimensionality Reduction, Clustering, and More

1. Introduction The discovery of neural scaling laws has resulted in an explosion in the size of datasets consumed and produced by AI models [11] [9]. Traditional algorithms for unstructured data visualization, such as t-SNE [14] and UMAP [15], have not kept up with the pace of dataset scaling. This presents a significant challenge for data-centric AI explainability, since it relies upon methods like t-SNE and UMAP for exploratory data analysis.


Scale-Free Graph-Language Models

Lu, Jianglin, Liu, Yixuan, Zhang, Yitian, Fu, Yun

arXiv.org Artificial Intelligence

Graph-language models (GLMs) have demonstrated great potential in graph-based semi-supervised learning. A typical GLM consists of two key stages: graph generation and text embedding, which are usually implemented by inferring a latent graph and finetuning a language model (LM), respectively. However, the former often relies on artificial assumptions about the underlying edge distribution, while the latter requires extensive data annotations. To tackle these challenges, this paper introduces a novel GLM that integrates graph generation and text embedding within a unified framework. For graph generation, we leverage the scale-free property of real-world citation networks as a structural prior. We unexpectedly find that this natural property can be effectively approximated by a simple k-nearest neighbor (KNN) graph. For text embedding, we develop a graph-based pseudo-labeler that utilizes scale-free graphs to provide complementary supervision for improved LM finetuning. Extensive experiments on representative datasets validate our findings on the scale-free structural approximation of KNN graphs and demonstrate the effectiveness of integrating graph generation and text embedding with a real structural prior. Recently, graph-language models (GLMs) have been widely explored in graph-based semi-supervised classification on documents, especially for citation networks (Qin et al., 2023; Yu et al., 2025; Lu et al., 2023; He et al., 2024). When designing a GLM for classification, two key challenges arise: graph generation (how to generate a reasonable graph structure for the given documents) and text embedding (how to encode the textual sequences into meaningful semantic features). To address these problems, various GLMs have been proposed, which can be broadly categorized into latent graph inference (LGI) models and language-assisted graph (LAG) models.
LGI models focus on graph generation and typically rely on feature engineering approaches, such as bag-of-words (Harris, 1954), TF-IDF (Aizawa, 2003), and skip-gram (Mikolov et al., 2013), to encode textual sequences into shallow representations.
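The hub effect behind scale-free-like kNN degree distributions is easy to see in a directed kNN graph: out-degree is fixed at k, but in-degree concentrates on central points. A brute-force sketch (`knn_in_degrees` is our own illustration, not the paper's construction):

```python
def knn_in_degrees(points, k):
    """Directed kNN graph: every node points to its k nearest neighbors.
    The *in*-degree distribution is the quantity that can look
    scale-free, since hub nodes attract many incoming edges."""
    n = len(points)
    indeg = [0] * n
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sum((a - b) ** 2
                                         for a, b in zip(points[i], points[j])))
        for j in order[:k]:
            indeg[j] += 1
    return indeg
```

On a central point surrounded by a ring of others, the center collects almost all incoming edges while the total in-degree always sums to n * k.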


Structure-Guided Input Graph for GNNs facing Heterophily

Tenorio, Victor M., Navarro, Madeline, Rey, Samuel, Segarra, Santiago, Marques, Antonio G.

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged as a promising tool to handle data exhibiting an irregular structure. However, most GNN architectures perform well on homophilic datasets, where the labels of neighboring nodes are likely to be the same. In recent years, an increasing body of work has been devoted to the development of GNN architectures for heterophilic datasets, where labels do not exhibit this low-pass behavior. In this work, we create a new graph in which nodes are connected if they share structural characteristics, which implies a higher chance of sharing labels, and then use this new graph in the GNN architecture. To do this, we compute the k-nearest neighbors graph according to distances between structural features, which are either (i) role-based, such as degree, or (ii) global, such as centrality measures. Experiments show that the labels are smoother in this newly defined graph and that the performance of GNN architectures improves when using this alternative structure.
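A minimal sketch of the rewiring step: compute simple structural features per node (here degree and triangle count, as toy stand-ins for the role-based and centrality features in the paper) and connect each node to the nodes whose features are closest:

```python
def structural_knn(adj, k):
    """Rewire a graph so nodes with similar *structural* roles are
    connected. Features are (degree, triangle count) -- a toy stand-in
    for the role-based / centrality features described in the paper."""
    n = len(adj)
    def triangles(i):
        nbrs = adj[i]
        # Count neighbor pairs that are themselves connected.
        return sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    feats = [(len(adj[i]), triangles(i)) for i in range(n)]
    new_adj = [sorted((j for j in range(n) if j != i),
                      key=lambda j: sum((x - y) ** 2
                                        for x, y in zip(feats[i], feats[j])))[:k]
               for i in range(n)]
    return feats, new_adj
```

In a triangle with one pendant node, the two degree-2 triangle members get identical features and therefore pick each other in the rewired graph, even though structural similarity, not spatial proximity, drives the edge.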


Reviews: A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

Neural Information Processing Systems

SUMMARY: The paper studies the problem of testing whether a graph is ε-far from a kNN graph, where ε-far means that at least an ε-fraction of the edges need to be changed in order to make the graph a kNN graph. The paper presents an algorithm with an upper bound of O(√n · k²/ε²) edge/vertex queries and a lower bound of Ω(√n).

1. I guess "ω" should be "p".
2. The result of Proposition 12 is interesting, which bounds the number of points that can be in the kNN set of a particular point. The bound is k times the known bound for 1-NN. I wonder if this could be tightened somehow.
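The distance being tested can be made concrete with an exhaustive (and therefore non-sublinear) checker that measures the fraction of edges violating the kNN property; `knn_violation_fraction` is our own illustration, not the paper's query-efficient tester:

```python
def knn_violation_fraction(points, out_edges, k):
    """Fraction of directed edges (u -> v) where v is NOT among u's k
    nearest points. A crude exhaustive stand-in for the sublinear
    property tester discussed in the review."""
    n = len(points)
    def dist2(i, j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    bad = total = 0
    for u, nbrs in enumerate(out_edges):
        true_knn = set(sorted((j for j in range(n) if j != u),
                              key=lambda j: dist2(u, j))[:k])
        for v in nbrs:
            total += 1
            bad += v not in true_knn
    return bad / total
```

A graph is ε-far in this simplified sense when the returned fraction exceeds ε; the paper's contribution is deciding this with only about √n queries rather than inspecting every edge.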


EggNet: An Evolving Graph-based Graph Attention Network for Particle Track Reconstruction

Calafiura, Paolo, Chan, Jay, Delabrouille, Loic, Wang, Brandon

arXiv.org Machine Learning

Track reconstruction is a crucial task in particle experiments and is traditionally very computationally expensive due to its combinatorial nature. Recently, graph neural networks (GNNs) have emerged as a promising approach that can improve scalability. Most of these GNN-based methods, including the edge classification (EC) and the object condensation (OC) approach, require an input graph that needs to be constructed beforehand. In this work, we consider a one-shot OC approach that reconstructs particle tracks directly from a set of hits (point cloud) by recursively applying graph attention networks with an evolving graph structure. This approach iteratively updates the graphs and can better facilitate the message passing across each graph. Preliminary studies on the TrackML dataset show better track performance compared to the methods that require a fixed input graph.
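The recompute-then-message-pass cycle can be sketched as a loop that alternately rebuilds a kNN graph from the current embeddings and smooths each embedding toward its neighbors' mean. `evolve_graph` is a hypothetical caricature of the evolving-graph idea, not EggNet's attention-based update:

```python
def evolve_graph(points, k, steps=3, lr=0.5):
    """Toy evolving-graph loop: (i) build a kNN graph from the current
    embeddings, (ii) move each embedding toward its neighbors' mean,
    then repeat, mimicking the recompute-then-message-pass cycle."""
    pts = [list(p) for p in points]
    n, dim = len(pts), len(points[0])
    knn = []
    for _ in range(steps):
        knn = [sorted((j for j in range(n) if j != i),
                      key=lambda j: sum((pts[i][d] - pts[j][d]) ** 2
                                        for d in range(dim)))[:k]
               for i in range(n)]
        # Blend each point with the mean of its current neighbors.
        pts = [[(1 - lr) * pts[i][d] + lr * sum(pts[j][d] for j in knn[i]) / k
                for d in range(dim)]
               for i in range(n)]
    return pts, knn
```

Hits belonging to the same (toy) track contract toward each other across iterations, so the final graph cleanly pairs them, which is the intuition behind letting the graph evolve with the embeddings.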


Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents

Dong, Yanfei, Deng, Lambert, Zhang, Jiazheng, Yu, Xiaodong, Lin, Ting, Gelli, Francesco, Poria, Soujanya, Lee, Wee Sun

arXiv.org Artificial Intelligence

Documents that consist of diverse templates and exhibit complex spatial structures pose a challenge for document entity classification. We propose KNN-former, which incorporates a new kind of spatial bias in attention calculation based on the K-nearest-neighbor (KNN) graph of document entities. We limit entities' attention only to their local radius defined by the KNN graph. We also use combinatorial matching to address the one-to-one mapping property that exists in many documents, where one field has only one corresponding entity. Moreover, our method is highly parameter-efficient compared to existing approaches in terms of the number of trainable parameters. Despite this, experiments across various datasets show our method outperforms baselines in most entity types. Many real-world documents exhibit combinatorial properties which can be leveraged as inductive biases to improve extraction accuracy, but existing datasets do not cover these documents. To facilitate future research into these types of documents, we release a new ID document dataset that covers diverse templates and languages. We also release enhanced annotations for an existing dataset.
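The spatial bias can be sketched as a boolean attention mask derived from a kNN graph over entity coordinates; `knn_attention_mask` and `masked_softmax` are simplified stand-ins for KNN-former's attention calculation, assuming plain Euclidean distances between entity positions:

```python
import math

def knn_attention_mask(coords, k):
    """Boolean mask: entity i may attend only to itself and its k
    spatially nearest entities (a KNN-former-style locality bias,
    sketched from the abstract)."""
    n = len(coords)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sum((a - b) ** 2
                                         for a, b in zip(coords[i], coords[j])))
        mask[i][i] = True
        for j in order[:k]:
            mask[i][j] = True
    return mask

def masked_softmax(scores, mask_row):
    """Softmax over scores with masked-out positions forced to zero."""
    exps = [math.exp(s) if m else 0.0 for s, m in zip(scores, mask_row)]
    z = sum(exps)
    return [e / z for e in exps]
```

Even a huge raw attention score toward a distant entity contributes nothing once masked, which is how the kNN radius keeps attention local.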